Load balancing between multiple web servers

ABSTRACT

A system for load balancing in a network environment including a plurality of network resources coupled to a network wherein at least some of the network resources provide redundant services. A client is coupled to the network and a gateway machine is coupled to the network in communication with the client. The gateway machine is configured to receive requests from the client, establish a communication channel through the network with a network resource specified by the client, and access the specified network resources to service the received client requests. The gateway machine includes or is coupled to mechanisms for selecting amongst providers of redundant services a particular provider for a received request so as to balance load amongst the plurality of resources providing redundant services.

RELATED APPLICATIONS

[0001] The present invention claims priority from U.S. Provisional Patent Application No. 60/197,490 entitled CONDUCTOR GATEWAY filed on Apr. 17, 2000.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present invention relates, in general, to network information access and, more particularly, to software, systems and methods for serving web pages in a coordinated fashion from multiple cooperating web servers.

[0004] 2. Relevant Background

[0005] Increasingly, business data processing systems, entertainment systems, and personal communications systems are implemented by computers across networks that are interconnected by internetworks (e.g., the Internet). The Internet is rapidly emerging as the preferred system for distributing and exchanging data. Data exchanges support applications including electronic commerce, broadcast and multicast messaging, videoconferencing, gaming, and the like.

[0006] The Internet is a collection of disparate computers and networks coupled together by a web of interconnections using standardized communication protocols. The Internet is characterized by its vast reach as a result of its wide and increasing availability and easy access protocols. Unfortunately, the heterogeneous nature of the Internet results in variable bandwidth and quality of service between points. The latency and reliability of data transport is largely determined by the total amount of traffic on the Internet and so varies wildly seasonally and throughout the day. Other factors that affect quality of service include equipment outages and line degradation that force packets to be rerouted, damaged and/or dropped. Also, routing software and hardware limitations within the Internet infrastructure may create bandwidth bottlenecks, even when the mechanisms are operating within specifications.

[0007] Internet transport protocols do not discriminate between users. Data packets are passed between routers and switches that make up the Internet fabric based on the hardware's instantaneous view of the best path between source and destination nodes specified in the packet. Because each packet may take a different path, the latency of a packet cannot be guaranteed and, in practice, varies significantly. Likewise, data packets are routed through the Internet without any prioritization based on content.

[0008] Prioritization has not been an issue with conventional networks such as local area networks (LANs) and wide area networks (WANs) because the average latency of such networks has been sufficiently low and sufficiently uniform to provide acceptable performance. However, there is an increasing demand for network applications that cannot tolerate high and variable latency. This situation is complicated when the application is to be run over the Internet where latency and variability in latency are many times greater than in LAN and WAN environments.

[0009] A particular need exists in environments that involve multiple users accessing a network resource such as a web server. Examples include broadcast, multicast and videoconferencing as well as most electronic commerce (e-commerce) applications. In these applications, it is important to maintain a reliable connection so that the server and clients remain synchronized and information is not lost.

[0010] In electronic commerce (e-commerce) applications, it is important to provide a satisfying buying experience that leads to a purchase transaction. To provide this high level of service, a web site operator must ensure that data is delivered to the customer in the most usable and efficient fashion. Also, the web site operator must ensure that critical data received from the customer is handled with priority.

[0011] Until now, however, the e-commerce site owner has had little or no control over the transport mechanisms through the Internet that affect the latency and quality of service. This is akin to a retailer being forced to deal with a customer by shouting across the street, never certain how often what was said must be repeated, and knowing that during rush hour communication would be nearly impossible. While efforts are continually being made to increase the capacity and quality of service afforded by the Internet, it is contemplated that congestion will always impact the ability to predictably and reliably offer a specified level of service. Moreover, the change in the demand for bandwidth increases at a greater rate than does the change in bandwidth supply, ensuring that congestion will continue to be an issue into the foreseeable future. A need exists for a system to exchange data over the Internet that provides a high quality of service even during periods of congestion.

[0012] Many e-commerce transactions are abandoned by the user because system performance degradations frustrate the purchaser before the transaction is consummated. While a transaction that is abandoned while a customer is merely browsing through a catalog may be tolerable, abandonment when the customer is just a few clicks away from a purchase is highly undesirable. However, existing Internet transport mechanisms and systems do not allow the e-commerce site owner any ability to distinguish between the “just browsing” and the “about-to-buy” customers. In fact, the vagaries of the Internet may lead to the casual browser receiving a higher quality of service while the about-to-buy customer becomes frustrated and abandons the transaction.

[0013] Web sites are often implemented on a plurality of web servers which may or may not be running on separate hosting machines. Each web server handles a set of content and some load distribution software runs on top of the multiple web servers to direct requests to the web server that can handle the request. The multiple servers essentially act as peers with each server having a set of resources from which it serves requests. It is difficult to distribute load efficiently amongst the multiple servers, however. One or more servers may experience heavy traffic while other servers remain essentially idle, unable to assist the overworked servers. This results in poor performance.

[0014] Several load balancing solutions have been proposed for web sites including, for example, webQOS offered by Hewlett Packard Company. Another recent load balancing solution described in U.S. Pat. No. 5,894,554 uses a set of “page servers” to offload some page generation functionality from the originating web server. These page servers remain in the IP address domain of the originating web server and so are closely coupled to the web server itself. Such solutions implement a logical layer over all the available web servers that receive incoming requests and route the requests intelligently to available web servers at a rate that allows the web servers to operate more efficiently. However, these solutions are generally operative after a client request has been transported through the network and received at the web site and so do not affect load balancing through the network itself. Moreover, such solutions do not extend well to web servers that are geographically and/or topologically distributed. A need exists for a load balancing mechanism that provides efficient load balancing throughout a communication network.

[0015] Also, current load balancing techniques operate on a request-by-request basis and so cannot readily direct requests in a manner that balances load at a more abstract level over whole systems and web sites. Although in many cases web sites are implemented in a stateless fashion so that requests can be handled on a request-by-request basis, some optimization can be obtained by stateful sessions. A need exists for load balancing systems, methods and software that work cooperatively with a web site to balance load on other than a request-by-request basis.

SUMMARY OF THE INVENTION

[0016] Briefly stated, the present invention involves a system for load balancing in a network environment including a plurality of network resources coupled to a network wherein at least some of the network resources provide redundant services. A client is coupled to the network and a gateway machine is coupled to the network in communication with the client. The gateway machine is configured to receive requests from the client, establish a communication channel through the network with a network resource specified by the client, and access the specified network resources to service the received client requests. The gateway machine includes or is coupled to mechanisms for selecting amongst providers of redundant services a particular provider for a received request so as to balance load amongst the plurality of resources providing redundant services.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] FIG. 1 illustrates a general distributed computing environment in which the present invention is implemented;

[0018] FIG. 2 shows in block-diagram form significant components of a system in accordance with the present invention;

[0019] FIG. 3 shows a domain name system used in an implementation of the present invention;

[0020] FIG. 4 shows front-end components of FIG. 2 in greater detail;

[0021] FIG. 5 shows back-end components of FIG. 2 in greater detail;

[0022] FIG. 6 shows a conceptual block diagram of the system of FIG. 2 in an alternative context; and

[0023] FIG. 7 illustrates an alternative embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0024] The present invention is illustrated and described in terms of a distributed computing environment such as an enterprise computing system using public communication channels such as the Internet. However, an important feature of the present invention is that it is readily scaled upwardly and downwardly to meet the needs of a particular application. Accordingly, unless specified to the contrary, the present invention is applicable to significantly larger, more complex network environments, including wireless network environments, as well as small network environments such as conventional LAN systems.

[0025] In accordance with the present invention, load balancing is performed on the front-end, before a network connection. Prior systems load balance at the back-end after the connection, and cannot select a channel. One feature of the present invention is that it provides a way to improve use of multiple channels through the network. The load balancing decisions are made at an intermediary server that is located on the network topology at a location logically closer to the client or software application making the request than the web server itself. This enables the present invention to not only balance load amongst multiple web servers in a web site hosting center, but also to balance load across multiple available communication channels based on, for example, quality of service provided on the available channels.

[0026] One feature of the present invention is that the front-end servers are in separate IP address domains from the originating web server. A redirection mechanism is enabled to select from a pool of available front-end servers and direct client request packets from the originating web server to a selected front-end server. Preferably, the front-end server establishes and maintains an enhanced communication channel with the originating web server. By enhanced, it is meant that the channel offers improved quality of service, lower latency, prioritization services, higher security transport, or other features and services that improve upon the basic transport mechanisms (such as TCP) defined for Internet data transport.

[0027] In this manner, the load balancing functionality can be performed before the request is launched across the public network. A front-end that is logically close to the client process that is requesting service is selected from the pool of available front-end servers. The selected front-end is configured to provide the enhanced channel to the originating web site using, for example, a back-end server. An enhanced channel may already exist and such existence may be a criterion used to select a particular front-end server from the pool of front-end servers.
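
By way of illustration only, the selection described above might be sketched as follows. The FrontEnd record, its latency and load fields, and the weighting constants are hypothetical and are not specified in this description; the sketch merely assumes that lower measured latency, lighter load, and an already established enhanced channel each make a front-end more attractive.

```python
from dataclasses import dataclass


@dataclass
class FrontEnd:
    # All fields are illustrative assumptions, not part of the described system.
    name: str
    latency_ms: float           # measured client-to-front-end latency
    load: float                 # fraction of capacity in use, 0.0 to 1.0
    has_enhanced_channel: bool  # True if a channel to the origin already exists


def select_front_end(pool, channel_bonus_ms=20.0, load_penalty_ms=100.0):
    """Return the front-end with the lowest score (lower is better)."""
    if not pool:
        raise ValueError("no front-end servers available")

    def score(fe):
        s = fe.latency_ms + load_penalty_ms * fe.load
        if fe.has_enhanced_channel:
            # An existing enhanced channel is treated as a selection criterion.
            s -= channel_bonus_ms
        return s

    return min(pool, key=score)


pool = [
    FrontEnd("fe-a", latency_ms=15.0, load=0.90, has_enhanced_channel=False),
    FrontEnd("fe-b", latency_ms=25.0, load=0.20, has_enhanced_channel=True),
    FrontEnd("fe-c", latency_ms=40.0, load=0.10, has_enhanced_channel=False),
]
chosen = select_front_end(pool)
print(chosen.name)
```

In this sketch the nearest front-end is passed over because it is heavily loaded, and a slightly more distant front-end that already holds an enhanced channel is chosen instead.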

[0028] For purposes of this document, a web server is a computer or system of computers running server software coupled to the World Wide Web (i.e., “the web”) that delivers or serves web pages. The web server has a unique IP address and accepts connections in order to service requests by sending back responses. A web server differs from a proxy server or a gateway server in that a web server has resident a set of resources (i.e., software programs, data storage capacity, and/or hardware) that enable it to execute programs to provide an extensible range of functionality such as generating web pages, accessing remote network resources, analyzing contents of packets, reformatting request/response traffic and the like using the resident resources. In contrast, a proxy simply forwards request/response traffic on behalf of a client to resources that reside elsewhere, or obtains resources from a local cache if implemented. A web server in accordance with the present invention may reference external resources of the same or different type as the services requested by a user, and reformat and augment what is provided by the external resources in its response to the user. Commercially available web server software includes Microsoft Internet Information Server (IIS), Netscape Netsite, Apache, among others. Alternatively, a web site may be implemented with custom or semi-custom software that supports HTTP traffic.

[0029] FIG. 1 shows an exemplary computing environment 100 in which the present invention may be implemented. Environment 100 includes a plurality of local networks such as Ethernet network 102, FDDI network 103 and Token Ring network 104. Essentially, a number of computing devices and groups of devices are interconnected through a network 101. For example, local networks 102, 103 and 104 are each coupled to network 101 through routers 109. LANs 102, 103 and 104 may be implemented using any available topology and may implement one or more server technologies including, for example UNIX, Novell, or Windows NT networks, including both client-server and peer-to-peer type networks. Each network will include distributed storage implemented in each device and typically includes some mass storage device coupled to or managed by a server computer. Network 101 comprises, for example, a public network such as the Internet or another network mechanism such as a fiber channel fabric or conventional WAN technologies.

[0030] Local networks 102, 103 and 104 include one or more network appliances 107. One or more network appliances 107 may be configured as an application and/or file server. Each local network 102, 103 and 104 may include a number of shared devices (not shown) such as printers, file servers, mass storage and the like. Similarly, devices 111 may be shared through network 101 to provide application and file services, directory services, printing, storage, and the like. Routers 109 provide a physical connection between the various devices through network 101. Routers 109 may implement desired access and security protocols to manage access through network 101.

[0031] Network appliances 107 may also couple to network 101 through public switched telephone network 108 using copper or wireless connection technology. In a typical environment, an Internet service provider 106 supports a connection to network 101 as well as PSTN 108 connections to network appliances 107.

[0032] Network appliances 107 may be implemented as any kind of network appliance having sufficient computational function to execute software needed to establish and use a connection to network 101. Network appliances 107 may comprise workstation and personal computer hardware executing commercial operating systems such as Unix variants, Microsoft Windows, Macintosh OS, and the like. At the same time, some appliances 107 comprise portable or handheld devices using wireless connections through a wireless access provider such as personal digital assistants and cell phones executing operating system software such as PalmOS, WindowsCE, and the like. Moreover, the present invention is readily extended to network devices such as office equipment, vehicles, and personal communicators that make occasional connection through network 101.

[0033] Each of the devices shown in FIG. 1 may include memory, mass storage, and a degree of data processing capability sufficient to manage their connection to network 101. The computer program devices in accordance with the present invention are implemented in the memory of the various devices shown in FIG. 1 and enabled by the data processing capability of the devices shown in FIG. 1. In addition to local memory and storage associated with each device, it is often desirable to provide one or more locations of shared storage such as disk farm (not shown) that provides mass storage capacity beyond what an individual device can efficiently use and manage. Selected components of the present invention may be stored in or implemented in shared mass storage.

[0034] The present invention operates in a manner akin to a private network 200 implemented within the Internet infrastructure. Private network 200 expedites and prioritizes communications between a client 205 and a web site 210. In the specific examples herein, client 205 comprises a network-enabled graphical user interface such as a World Wide Web browser. However, the present invention is readily extended to client software other than conventional web browser software. Any client application that can access a standard or proprietary user level protocol for network access is a suitable equivalent. Examples include client applications for file transfer protocol (FTP) services, voice over Internet protocol (VOIP) services, network news protocol (NNTP) services, multi-purpose internet mail extensions (MIME) services, post office protocol (POP) services, simple mail transfer protocol (SMTP) services, as well as Telnet services. In addition to network protocols, the client application may access a network application such as a database management system (DBMS) in which case the client application generates query language (e.g., structured query language or “SQL”) messages. In wireless appliances, a client application may communicate via a wireless application protocol (WAP) or the like.

[0035] For convenience, the term “web site” is used interchangeably with “web server” in the description herein, although it should be understood that a web site comprises a collection of content, programs and processes implemented on one or more web servers. A web site is owned by the content provider such as an e-commerce vendor whereas a web server refers to a set of programs running on one or more machines coupled to an Internet node. The web site 210 may be hosted on the site owner's own web server, or hosted on a web server owned by a third party. A web hosting center is an entity that implements one or more web sites on one or more web servers using shared hardware and software resources across the multiple web sites. In a typical web infrastructure, there are many web browsers, each of which has a TCP connection to the web server in which a particular web site is implemented. The present invention adds two components to the infrastructure: a front-end 201 and back-end 203. Front-end 201 and back-end 203 are coupled by a managed data communication link 202 that forms, in essence, a private network.

[0036] Front-end mechanism 201 serves as an access point for client-side communications. Front-end 201 implements a gateway that functions as a proxy for the web server(s) implementing web site 210 (i.e., from the perspective of client 205, front-end 201 appears to be the web site 210). Front-end 201 comprises, for example, a computer that sits “close” to clients 205. By “close”, it is meant that the average latency associated with a connection between a client 205 and a front-end 201 is less than the average latency associated with a connection between a client 205 and a web site 210. Desirably, front-end computers have as fast a connection as possible to the clients 205. For example, the fastest available connection may be implemented in a point of presence (POP) of an Internet service provider (ISP) 106 used by a particular client 205. However, the placement of the front-ends 201 can limit the number of browsers that can use them. Because of this, in some applications it is more practical to place one front-end computer in such a way that several POPs can connect to it. Greater distance between front-end 201 and clients 205 may be desirable in some applications as this distance will allow for selection amongst a greater number of front-ends 201 and thereby provide significantly different routes to a particular back-end 203. This may offer benefits when particular routes and/or front-ends become congested or otherwise unavailable.

[0037] Transport mechanism 202 is implemented by cooperative actions of the front-end 201 and back-end 203. Back-end 203 processes and directs data communication to and from web site 210. Transport mechanism 202 communicates data packets using a proprietary protocol over the public Internet infrastructure in the particular example. Hence, the present invention does not require heavy infrastructure investments and automatically benefits from improvements implemented in the general purpose network 101. Unlike the general purpose Internet, front-end 201 and back-end 203 are programmably assigned to serve accesses to a particular web site 210 at any given time.

[0038] It is contemplated that any number of front-end and back-end mechanisms may be implemented cooperatively to support the desired level of service required by the web site owner. The present invention implements a many-to-many mapping of front-ends to back-ends. Because the front-end to back-end mappings can be dynamically changed, a fixed hardware infrastructure can be logically reconfigured to map more or fewer front-ends to more or fewer back-ends and web sites or servers as needed.
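
A many-to-many mapping of this kind can be pictured with a small data structure that keeps both directions of the relation in step. The following is a minimal sketch under stated assumptions: the Mapping class, its method names, and the string identifiers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class Mapping:
    """Hypothetical many-to-many table of front-ends to back-ends."""

    def __init__(self):
        self._fe_to_be = defaultdict(set)
        self._be_to_fe = defaultdict(set)

    def link(self, front_end, back_end):
        # Record the pairing in both directions so either side can be queried.
        self._fe_to_be[front_end].add(back_end)
        self._be_to_fe[back_end].add(front_end)

    def unlink(self, front_end, back_end):
        # Removing a pairing that does not exist is treated as a no-op.
        self._fe_to_be[front_end].discard(back_end)
        self._be_to_fe[back_end].discard(front_end)

    def back_ends_for(self, front_end):
        return sorted(self._fe_to_be.get(front_end, ()))

    def front_ends_for(self, back_end):
        return sorted(self._be_to_fe.get(back_end, ()))


mapping = Mapping()
mapping.link("fe1", "be1")
mapping.link("fe1", "be2")
mapping.link("fe2", "be1")

# Logical reconfiguration: the hardware is unchanged, only the table is edited.
mapping.unlink("fe1", "be2")
mapping.link("fe2", "be2")
print(mapping.back_ends_for("fe2"))
```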

[0039] Front-end 201 together with back-end 203 function to reduce traffic across the transport morphing protocol™ (TMP™) link 202 and to improve response time for selected browsers. Transport morphing protocol and TMP are trademarks or registered trademarks of Circadence Corporation in the United States and other countries. Traffic across the TMP link 202 is reduced by compressing data and serving browser requests from cache for fast retrieval. Also, the blending of request datagrams results in fewer request:acknowledge pairs across the TMP link 202 as compared to the number required to send the packets individually between front-end 201 and back-end 203. This action reduces the overhead associated with transporting a given amount of data, although conventional request:acknowledge traffic is still performed on the links coupling the front-end 201 to client 205 and back-end 203 to a web server. Moreover, resend traffic is significantly reduced, further reducing the traffic. Response time is further improved for select privileged users and for specially marked resources by determining the priority for each HTTP transmission.

[0040] In one embodiment, front-end 201 and back-end 203 are closely coupled to the Internet backbone. This means they have high bandwidth connections, can expect fewer hops, and have more predictable packet transit time than could be expected from a general-purpose connection. Although it is preferable to have low latency connections between front-ends 201 and back-ends 203, a particular strength of the present invention is its ability to deal with latency by enabling efficient transport and traffic prioritization. Hence, in other embodiments front-end 201 and/or back-end 203 may be located farther from the Internet backbone and closer to clients 205 and/or web servers 210. Such an implementation reduces the number of hops required to reach a front-end 201 while increasing the number of hops within the TMP link 202 thereby yielding control over more of the transport path to the management mechanisms of the present invention.

[0041] Clients 205 no longer conduct all data transactions directly with the web server 210. Instead, clients 205 conduct some and preferably a majority of transactions with front-ends 201, which simulate the functions of web server 210. Client data is then sent, using TMP link 202, to the back-end 203 and then to the web server 210. Running multiple clients 205 over one large connection provides several advantages:

[0042] Since all client data is mixed, each client can be assigned a priority. Higher priority clients, or clients requesting higher priority data, can be given preferential access to network resources so they receive access to the channel sooner while ensuring low-priority clients receive sufficient service to meet their needs.

[0043] The large connection between a front-end 201 and back-end 203 can be permanently maintained, shortening the many TCP/IP connection sequences normally required for many clients connecting and disconnecting.

[0044] Using a proprietary protocol allows the use of more effective techniques to improve data throughput and makes better use of existing bandwidth during periods when the network is congested.
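
The first of these advantages, preferential access for higher priority clients without starving lower priority clients, can be illustrated with a weighted round-robin interleaving of per-client queues onto the shared link. This is a sketch only: weighted round-robin is a technique assumed here for illustration, the description does not state which scheduling method is used, and the function name, queue contents, and weights are hypothetical.

```python
from collections import deque


def multiplex(queues, weights):
    """Interleave per-client packet queues onto one shared link.

    queues:  dict mapping client name -> list of packets (hypothetical)
    weights: dict mapping client name -> integer priority weight >= 1
    In each round a client may send up to `weight` packets, so every
    client with pending data is served at least once per round.
    """
    pending = {client: deque(packets) for client, packets in queues.items()}
    # Higher-weight clients are served first within each round.
    order = sorted(pending, key=lambda client: -weights[client])
    link = []
    while any(pending.values()):
        for client in order:
            for _ in range(weights[client]):
                if not pending[client]:
                    break
                link.append((client, pending[client].popleft()))
    return link


queues = {"browse": ["b1", "b2"], "buy": ["p1", "p2", "p3"]}
weights = {"browse": 1, "buy": 2}
link = multiplex(queues, weights)
print(link)
```

Here the hypothetical "buy" client reaches the channel sooner and more often, while the "browse" client still sends one packet in every round.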

[0045] A particular advantage of the architecture shown in FIG. 2 is that it is readily scaled. Any number of client machines 205 may be supported. In a similar manner, a web site owner may choose to implement a site using multiple web servers 210 that are co-located or distributed throughout network 101. To avoid congestion, additional front-ends 201 may be implemented or assigned to particular web sites. Client traffic is dynamically directed to available front-ends 201 to provide load balancing. Hence, when quality of service drops because of a large number of client accesses, an additional front-end 201 can be assigned to the web site and subsequent client requests directed to the newly assigned front-end 201 to distribute traffic across a broader base.

[0046] In the particular examples, this is implemented by a front-end manager component 207 that communicates with multiple front-ends 201 to provide administrative and configuration information to front-ends 201. Each front-end 201 includes data structures for storing the configuration information, including information identifying the IP addresses of web servers 210 to which they are currently assigned. Other administrative and configuration information stored in front-end 201 may include information for prioritizing data from and to particular clients, quality of service information, and the like.

[0047] Similarly, additional back-ends 203 can be assigned to a web site to handle increased traffic. Back-end manager component 209 couples to one or more back-ends 203 to provide centralized administration and configuration service. Back-ends 203 include data structures to hold current configuration state, quality of service information and the like. In the particular examples front-end manager 207 and back-end manager 209 serve multiple web sites 210 and so are able to manipulate the number of front-ends and back-ends assigned to each web site 210 by updating this configuration information. When the congestion for the site subsides, the front-end 201 and back-end 203 can be reassigned to other, busier web sites. These and similar modifications are equivalent to the specific examples illustrated herein.
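
The reassignment of front-ends between web sites by updating configuration information might look like the following sketch. The FrontEndManager class, the load figures, and the high and low thresholds are hypothetical; the description does not define how congestion is measured or when a reassignment is triggered.

```python
class FrontEndManager:
    """Hypothetical manager holding the front-end to web site assignments."""

    def __init__(self, assignments):
        # assignments: dict mapping front-end name -> web site name
        self.assignments = dict(assignments)

    def front_ends_for(self, site):
        return sorted(fe for fe, s in self.assignments.items() if s == site)

    def rebalance(self, load_per_site, high=0.8, low=0.3):
        """Move one front-end from the idlest site to the busiest site.

        Returns the name of the moved front-end, or None if nothing moved.
        The thresholds are illustrative assumptions.
        """
        busiest = max(load_per_site, key=load_per_site.get)
        idlest = min(load_per_site, key=load_per_site.get)
        if load_per_site[busiest] < high or load_per_site[idlest] > low:
            return None
        donors = self.front_ends_for(idlest)
        if len(donors) < 2:
            return None  # keep at least one front-end assigned to every site
        moved = donors[-1]
        self.assignments[moved] = busiest
        return moved


manager = FrontEndManager({"fe1": "shop", "fe2": "news", "fe3": "news"})
moved = manager.rebalance({"shop": 0.95, "news": 0.10})
print(moved, manager.front_ends_for("shop"))
```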

[0048] In the case of web-based environments, front-end 201 is implemented using custom or off-the-shelf web server software. Front-end 201 is readily extended to support other, non-web-based protocols, however, and may support multiple protocols for varieties of client traffic. Front-end 201 processes the data traffic it receives, regardless of the protocol of that traffic, to a form suitable for transport by TMP 202 to a back-end 203. Hence, most of the functionality implemented by front-end 201 is independent of the protocol or format of the data received from a client 205. Hence, although the discussion of the exemplary embodiments herein relates primarily to front-end 201 implemented as a web server, it should be noted that, unless specified to the contrary, web-based traffic management and protocols are merely examples and not a limitation of the present invention.

[0049] As shown in FIG. 2, in accordance with the present invention a web site is implemented using an originating web server 210 operating cooperatively with the web server of front-end 201. More generally, any network service (e.g., FTP, VoIP, NNTP, MIME, SMTP, Telnet, DBMS) can be implemented using a combination of an originating server working cooperatively with a front-end 201 configured to provide a suitable interface (e.g., FTP, VoIP, NNTP, MIME, SMTP, Telnet, DBMS, WAP) for the desired service. In contrast to a simple front-end cache or proxy software, implementing a server in front-end 201 enables portions of the web site (or other network service) to actually be implemented in and served from both locations. The actual web pages or service being delivered comprises a composite of the portions generated at each server. Significantly, however, the web server in front-end 201 is close to the browser in a client 205 whereas the originating web server is close to all resources available at the web hosting center at which web site 210 is implemented. In essence the web site 210 is implemented by a tiered set of web servers comprising a front-end server 201 standing in front of an originating web server.

[0050] This difference enables the web site or other network service to be implemented so as to take advantage of the unique topological position each entity has with respect to the client 205. By way of a particular example, assume an environment in which the front-end server 201 is located at the location of an ISP used by a particular set of clients 205. In such an environment, clients 205 can access the front-end server 201 without actually traversing the network 101.

[0051] In order for a client 205 to obtain service from a front-end 201, it must first be directed to a front-end 201 that can provide the desired service. Preferably, client 205 does not need to be aware of the location of front-end 201, and initiates all transactions as if it were contacting the originating server 210. FIG. 3 illustrates a domain name server (DNS) redirection mechanism that shows how a client 205 is connected to a front-end 201. The DNS system is defined in a variety of Internet Engineering Task Force (IETF) documents such as RFC 0883, RFC 1034 and RFC 1035 which are incorporated by reference herein. In a typical environment, a client 205 executes a browser 301, TCP/IP stack 303, and a resolver 305. For reasons of performance and packaging, browser 301, TCP/IP stack 303 and resolver 305 are often grouped together as routines within a single software product.

[0052] Browser 301 functions as a graphical user interface to implement user input/output (I/O) through monitor 311 and associated keyboard, mouse, or other user input device (not shown). Browser 301 is usually used as an interface for web-based applications, but may also be used as an interface for other applications such as email and network news, as well as special-purpose applications such as database access, telephony, and the like. Alternatively, a special-purpose user interface may be substituted for the more general-purpose browser 301 to handle a particular application.

[0053] TCP/IP stack 303 communicates with browser 301 to convert data between formats suitable for browser 301 and IP format suitable for Internet traffic. TCP/IP stack 303 also implements a TCP protocol that manages transmission of packets between client 205 and an Internet service provider (ISP) or equivalent access point. IP protocol requires that each data packet include, among other things, an IP address identifying a destination node. In current implementations the IP address comprises a 32-bit value that identifies a particular Internet node. Non-IP networks have similar node addressing mechanisms. To provide a more user-friendly addressing system, the Internet implements a system of domain name servers that map alpha-numeric domain names to specific IP addresses. This system enables a name space that provides a more consistent reference between nodes on the Internet and avoids the need for users to know network identifiers, addresses, routes and similar information in order to make a connection.

[0054] The domain name service is implemented as a distributed database managed by domain name servers (DNSs) 307 such as DNS_A, DNS_B and DNS_C shown in FIG. 3. Each DNS relies on <domain name:IP> address mapping data stored in master files scattered through the hosts that use the domain system. These master files are updated by local system administrators. Master files typically comprise text files that are read by a local name server, and hence become available through the name servers 307 to users of the domain system.

[0055] The user programs (e.g., clients 205) access name servers through standard programs such as resolver 305. Resolver 305 includes an address of a DNS 307 that serves as a primary name server. When presented with a reference to a domain name (e.g., http://www.circadence.com), resolver 305 sends a request to the primary DNS (e.g., DNS_A in FIG. 3). The primary DNS 307 returns either the IP address mapped to that domain name, a reference to another DNS 307 which has the mapping information (e.g., DNS_B in FIG. 3), or a partial IP address together with a reference to another DNS that has more IP address information. Any number of DNS-to-DNS references may be required to completely determine the IP address mapping.

[0056] In this manner, the resolver 305 becomes aware of the IP address mapping which is supplied to TCP/IP component 303. Client 205 may cache the IP address mapping for future use. TCP/IP component 303 uses the mapping to supply the correct IP address in packets directed to a particular domain name so that reference to the DNS system need only occur once per connection to a web site.

[0057] In accordance with the present invention, at least one DNS server 307 is owned and controlled by system components of the present invention. When a user accesses a network resource (e.g., a web site), browser 301 contacts the public DNS system to resolve the requested domain name into its related IP address in a conventional manner. In a first embodiment, the public DNS performs a conventional DNS resolution directing the browser to an originating server 210 and server 210 performs a redirection of the browser to the system-owned DNS server (i.e., DNS_C in FIG. 3). In a second embodiment, domain:address mappings within the DNS system are modified such that resolution of the originating server's domain automatically returns the address of the system-owned DNS server (DNS_C). Once a browser is redirected to the system-owned DNS server, that server begins a process of further redirecting the browser 301 to the best available front-end 201.

[0058] Unlike a conventional DNS server, however, the system-owned DNS_C in FIG. 3 receives domain:address mapping information from a redirector component 309. Redirector 309 is in communication with front-end manager 207 and back-end manager 209 to obtain information on current front-end and back-end assignments to a particular server 210. A conventional DNS is intended to be updated infrequently by reference to its associated master file. In contrast, the master file associated with DNS_C is dynamically updated by redirector 309 to reflect current assignment of front-end 201 and back-end 203. In operation, a reference to web server 210 (e.g., http://www.circadence.com) may result in an IP address returned from DNS_C that points to any selected front-end 201 that is currently assigned to web site 210. Likewise, web site 210 may identify a currently assigned back-end 203 by direct or indirect reference to DNS_C.
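
By way of illustration only, the dynamically updated mapping described above can be sketched as a small in-memory table that a redirector rewrites at run time. The class name DynamicNameServer, its methods, the round-robin choice among assigned front-ends, and the example addresses are all assumptions made for this sketch; the specification only requires that DNS_C return the address of some currently assigned front-end 201.

```python
import itertools


class DynamicNameServer:
    """Toy stand-in for DNS_C: a domain-to-address table rewritten at run time."""

    def __init__(self):
        self._assignments = {}  # domain -> list of assigned front-end addresses
        self._cursors = {}      # domain -> cycling iterator over those addresses

    def update(self, domain, front_end_addresses):
        """Called by a (hypothetical) redirector whenever assignments change."""
        self._assignments[domain] = list(front_end_addresses)
        self._cursors[domain] = itertools.cycle(self._assignments[domain])

    def resolve(self, domain):
        """Return one currently assigned front-end address for the domain."""
        if not self._assignments.get(domain):
            raise KeyError(f"no front-end currently assigned to {domain}")
        # Round robin is an assumption; any selection policy could be used here.
        return next(self._cursors[domain])


dns_c = DynamicNameServer()
dns_c.update("www.example.com", ["192.0.2.10", "192.0.2.11"])
first = dns_c.resolve("www.example.com")
second = dns_c.resolve("www.example.com")
dns_c.update("www.example.com", ["192.0.2.20"])  # reassignment by the redirector
after_update = dns_c.resolve("www.example.com")
```

Because the table is replaced wholesale on each update, a client that resolves the same domain after a reassignment is pointed at the newly assigned front-end without any change on the client side.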

[0059] Front-end 201 typically receives information directly from front-end manager 207 about the address of currently assigned back-ends 203. Similarly, back-end 203 is aware of the address of a front-end 201 associated with each data packet. Hence, reference to the domain system is not required to map a front-end 201 to its appropriate back-end 203.

[0060] FIG. 4 illustrates principal functional components of an exemplary front-end 201 in greater detail. Primary functions of the front-end 201 include translating transmission control protocol (TCP) packets from client 205 into TMP packets used in the system in accordance with the present invention. It is contemplated that the various functions described in reference to the specific examples may be implemented using a variety of data structures and programs operating at any location in a distributed network. For example, a front-end 201 may be operated on a network appliance 107 or server within a particular network 102, 103, or 104 shown in FIG. 1. The present invention is readily adapted to any application where multiple clients are coupling to a centralized resource. Moreover, other transport protocols may be used, including proprietary transport protocols.

[0061] TCP component 401 includes devices for implementing physical connection layer and IP layer functionality. Current IP standards are described in IETF documents RFC0791, RFC0950, RFC0919, RFC0922, RFC0792, RFC1112 that are incorporated by reference herein. For ease of description and understanding, these mechanisms are not described in great detail herein. Where protocols other than TCP/IP are used to couple to a client 205, TCP component 401 is replaced or augmented with an appropriate network protocol process.

[0062] TCP component 401 communicates TCP packets with one or more clients 205. Received packets are coupled to parser 402 where the IP layer information is extracted. TCP is described in IETF RFC0793, which is incorporated herein by reference. Each TCP packet includes header information that indicates addressing and control variables, and a payload portion that holds the user-level data being transported by the TCP packet. The user-level data in the payload portion typically comprises a user-level network protocol datagram.

[0063] Parser 402 analyzes the payload portion of the TCP packet. In the examples herein, HTTP is employed as the user-level protocol because of its widespread use and the advantage that currently available browser software is able to readily use the HTTP protocol. In this case, parser 402 comprises an HTTP parser. More generally, parser 402 can be implemented as any parser-type logic implemented in hardware or software for interpreting the contents of the payload portion. Parser 402 may implement file transfer protocol (FTP), mail protocols such as simple mail transport protocol (SMTP), structured query language (SQL) and the like. Any user-level protocol, including proprietary protocols, may be implemented within the present invention using appropriate modification of parser 402.

[0064] To improve performance, front-end 201 optionally includes a caching mechanism 403. Cache 403 may be implemented as a passive cache that stores frequently and/or recently accessed web pages or as an active cache that stores network resources that are anticipated to be accessed. In non-web applications, cache 403 may be used to store any form of data representing database contents, files, program code, and other information. Upon receipt of a TCP packet, HTTP parser 402 determines if the packet is making a request for data within cache 403. If the request can be satisfied from cache 403, the data is supplied directly without reference to web server 210 (i.e., a cache hit). Cache 403 implements any of a range of management functions for maintaining fresh content. For example, cache 403 may invalidate portions of the cached content after an expiration period specified with the cached data or by web server 210. Also, cache 403 may proactively update the cache contents even before a request is received for particularly important or frequently used data from web server 210. Cache 403 evicts information using any desired algorithm such as least recently used, least frequently used, first in/first out, or random eviction. When the requested data is not within cache 403, a request is processed to web server 210, and the returned data may be stored in cache 403.
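
As a minimal sketch of one passive-cache configuration described above, the following combines an expiration period with least-recently-used eviction, which is only one of the eviction algorithms the paragraph permits. The names ExpiringLRUCache, fetch and origin_fetch, the default time-to-live, and the injectable clock are assumptions introduced for illustration and do not appear in the specification.

```python
import time
from collections import OrderedDict


class ExpiringLRUCache:
    """Passive cache: entries expire after a TTL, least recently used is evicted first."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock
        self._entries = OrderedDict()  # url -> (expires_at, body)

    def get(self, url):
        """Return the cached body, or None on a miss or an expired entry."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self.clock() >= expires_at:
            del self._entries[url]      # invalidate stale content
            return None
        self._entries.move_to_end(url)  # mark as most recently used
        return body

    def put(self, url, body, ttl_seconds):
        self._entries[url] = (self.clock() + ttl_seconds, body)
        self._entries.move_to_end(url)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry


def fetch(cache, url, origin_fetch, ttl_seconds=60.0):
    """Serve from the cache on a hit; otherwise ask the origin and store the result."""
    body = cache.get(url)
    if body is None:
        body = origin_fetch(url)  # stands in for a request to web server 210
        cache.put(url, body, ttl_seconds)
    return body
```

A caller would pass a function that actually contacts the originating server as origin_fetch. An active cache, which prefetches content before it is requested, would need an additional refresh loop that this sketch omits.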

[0065] Several types of packets will cause parser 402 to forward a request towards web server 210. For example, a request for data that is not within cache 403 (or if optional cache 403 is not implemented) will require a reference to web server 210. Some packets will comprise data that must be supplied to web server 210 (e.g., customer credit information, form data and the like). In these instances, HTTP parser 402 couples to data blender 404.

[0066] Optionally, front-end 201 implements security processes, compression processes, encryption processes and the like to condition the received data for improved transport performance and/or provide additional functionality. These processes may be implemented within any of the functional components (e.g., data blender 404) or implemented as separate functional components within front-end 201. Also, parser 402 may implement a prioritization program to identify packets that should be given higher priority service. A prioritization program requires only that parser 402 include a data structure associating particular clients 205 or particular TCP packet types or contents with a prioritization value. Based on the prioritization value, parser 402 may selectively implement such features as caching, encryption, security, compression and the like to improve performance and/or functionality. The prioritization value is provided by the owners of web site 210, for example, and may be dynamically altered, statically set, or updated from time to time to meet the needs of a particular application.

[0067] Blender 404 slices and/or coalesces the data portions of the received packets into more desirable “TMP units” that are sized for transport through the TMP mechanism 202. The data portion of TCP packets may range in size depending on client 205 and any intervening links coupling client 205 to TCP component 401. Moreover, where compression is applied, the compressed data will vary in size depending on the compressibility of the data. Data blender 404 receives information from front-end manager 207 that enables selection of a preferable TMP packet size. Alternatively, a fixed TMP packet size can be set that yields desirable performance across TMP mechanism 202. Data blender 404 also marks the TMP units so that they can be re-assembled at the receiving end.

[0068] Data blender 404 also serves as a buffer for storing packets from all clients 205 that are associated with front-end 201. Blender 404 mixes data packets coming into front-end 201 into a cohesive stream of TMP packets sent to back-end 203 over TMP link 202. In creating a TMP packet, blender 404 is able to pick and choose amongst the available data packets so as to prioritize some data packets over others.

[0069] In an exemplary implementation, a “TMP connection” comprises a plurality of “TCP connection buffers”, logically arranged in multiple “rings”. Each TCP socket maintained between the front-end 201 and a client 205 corresponds to a TCP connection buffer. When a TCP connection buffer is created, it is assigned a priority. For purposes of the present invention, any algorithm or criteria may be used to assign a priority. Each priority ring is associated with a number of TCP connection buffers having similar priority. In a specific example, five priority levels are defined corresponding to five priority rings. Each priority ring is characterized by the number of connection buffers it holds (nSockets), the number of connection buffers it holds that have data waiting to be sent (nReady) and the total number of bytes of data in all the connection buffers that it holds (nBytes).

[0070] When composing TMP data packets, the blender goes into a loop comprising the steps:

[0071] 1) Determine the number of bytes available to be sent from each ring (nBytes), and the number of TCP connections that are ready to send (nReady).

[0072] 2) Determine how many bytes should be sent from each ring (nSend). This is based on a weight parameter for each priority. The weight can be thought of as the number of bytes that should be sent at each priority this time through the loop.

[0073] 3) The nSend value computed in the previous step reflects the weighted proportion that each ring will have in a blended TMP packet, but the values of nSend do not reflect how many bytes need to be selected to actually empty most or all of the data waiting to be sent in a single round. To do this, the nSend value is normalized to the ring having the most data waiting (e.g., so that nSendNorm=nBytes for that ring). This involves a calculation of a factor: S=nBytes/(Weight*nReady) for the ring with the greatest nReady. Then, for each ring, calculate nReady*S*Weight to get the normalized value (nSendNorm) for each priority ring.

[0074] 4) Send sub-packets from the different rings. This is done by taking a sub-packet from the highest priority ring and adding it to a TMP packet, then adding a sub-packet from each of the top two queues, then the top three, and so on.

[0075] 5) Within each ring, sub-packets are added round robin. When a sub-packet is added from a TCP connection buffer the ring is rotated so the next sub-packet the ring adds will come from a different TCP connection buffer. Each sub-packet can be up to 512 bytes in a particular example. If the connection buffer has less than 512 bytes waiting, the data available is added to the TMP packet.

[0076] 6) When a full TMP packet (roughly 1.5 kB in a particular example) is built, it is sent. This can have three or more sub-packets, depending on their size. The TMP packet will also be sent when there is no more data ready.
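
The loop above can be sketched as follows, purely as an illustration under stated assumptions. The function normalized_send_sizes follows the step 3 formula literally, using the ring with the greatest nReady as the reference ring. The function compose_tmp_packet follows steps 4 through 6. The names Ring, take_sub_packet and compose_tmp_packet, the 1536-byte packet limit, and the trimming of the last sub-packet to the remaining space are assumptions. The specification does not state how the normalized values bound the bytes taken from each ring, so this sketch computes them separately and does not apply them during packet composition.

```python
from collections import deque

SUB_PACKET_MAX = 512    # bytes per sub-packet, from the example in the text
TMP_PACKET_MAX = 1536   # "roughly 1.5 kB"; the exact figure is an assumption


class Ring:
    """One priority ring holding the TCP connection buffers of similar priority."""

    def __init__(self, weight):
        self.weight = weight      # assumed to be positive
        self.buffers = deque()    # one bytearray per TCP connection buffer

    @property
    def n_ready(self):
        return sum(1 for buf in self.buffers if buf)

    @property
    def n_bytes(self):
        return sum(len(buf) for buf in self.buffers)

    def take_sub_packet(self, limit):
        """Take up to `limit` bytes from the next buffer with data, rotating the ring."""
        for _ in range(len(self.buffers)):
            buf = self.buffers[0]
            self.buffers.rotate(-1)  # next call starts at a different buffer
            if buf:
                chunk = bytes(buf[:limit])
                del buf[:limit]
                return chunk
        return b""


def normalized_send_sizes(rings):
    """Step 3: S = nBytes / (Weight * nReady) for the ring with the greatest nReady."""
    ready = [ring for ring in rings if ring.n_ready]
    if not ready:
        return [0.0] * len(rings)
    ref = max(ready, key=lambda ring: ring.n_ready)
    s = ref.n_bytes / (ref.weight * ref.n_ready)
    return [ring.n_ready * s * ring.weight for ring in rings]


def compose_tmp_packet(rings):
    """Steps 4 to 6: `rings` is ordered highest priority first."""
    packet, size, depth = [], 0, 1
    while any(ring.n_ready for ring in rings):
        # Visit the top ring, then the top two, then the top three, and so on.
        for ring in rings[:depth]:
            remaining = TMP_PACKET_MAX - size
            if remaining <= 0:
                return packet  # packet is full, so it is sent
            chunk = ring.take_sub_packet(min(SUB_PACKET_MAX, remaining))
            if chunk:
                packet.append(chunk)
                size += len(chunk)
        depth = min(depth + 1, len(rings))
    return packet  # no more data ready, so the partial packet is sent
```

Because higher-priority rings are visited on every pass while lower-priority rings join only on later passes, a higher-priority ring contributes more sub-packets to each TMP packet whenever it has data waiting.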

[0077] TMP mechanism 405 implements the TMP protocol in accordance with the present invention. TMP is a TCP-like protocol adapted to improve performance for multiple channels operating over a single connection. Front-end TMP mechanism 405 and a corresponding back-end TMP mechanism 505 shown in FIG. 5 are computer processes that implement the end points of TMP link 202. The TMP mechanism in accordance with the present invention creates and maintains a stable connection between two processes for high-speed, reliable, adaptable communication.

[0078] TMP is not merely a substitute for the standard TCP environment. TMP is designed to perform particularly well in heterogeneous network environments such as the Internet. TMP connections are made less often than TCP connections. Once a TMP connection is made, it remains up unless there is some kind of direct intervention by an administrator or there is some form of connection breaking network error. This reduces overhead associated with setting up, maintaining and tearing down connections normally associated with TCP.

[0079] Another feature of TMP is its ability to channel numerous TCP connections through a single TMP pipe 202. The environment in which TMP resides allows multiple TCP connections to occur at one end of the system. These TCP connections are then mapped into a single TMP connection. The TMP connection is then broken down at the other end of the TMP pipe 202 in order to traffic the TCP connections to their appropriate destinations. TMP includes mechanisms to ensure that each TMP connection gets enough of the available bandwidth to accommodate the multiple TCP connections that it is carrying.

[0080] Another advantage of TMP as compared to traditional protocols is the amount of information about the quality of the connection that a TMP connection conveys from one end to the other of a TMP pipe 202. As often happens in a network environment, each end has a great deal of information about the characteristics of the connection in one direction, but not the other. By knowing about the connection as a whole, TMP can better take advantage of the available bandwidth.

[0081] In contrast with conventional TCP mechanisms, the behavior implemented by TMP mechanism 405 is constantly changing. Because TMP obtains bandwidth to host a variable number of TCP connections and because TMP is responsive to information about the variable status of the network, the behavior of TMP is preferably continuously variable. One of the primary functions of TMP is being able to act as a conduit for multiple TCP connections. As such, a single TMP connection cannot behave in the same manner as a single TCP connection. For example, imagine that a TMP connection is carrying 100 TCP connections. If it loses one packet (from any one of the TCP connections) and quickly cuts its window size in half (as specified for TCP), the result is a performance reduction on 100 connections instead of just on the one that lost the packet.

[0082] Each TCP connection that is passed through the TMP connection must get a fair share of the bandwidth, and should not be easily squeezed out. To allow this to happen, every TMP connection becomes more aggressive in claiming bandwidth as it accelerates. Like TCP, the bandwidth available to a particular TMP connection is measured by its window size (i.e., the number of outstanding TCP packets that have not yet been acknowledged). Bandwidth is increased by increasing the window size, and relinquished by reducing the window size. Up to protocol specified limits, each time a packet is successfully delivered and acknowledged, the window size is increased until the window size reaches a protocol specified maximum. When a packet is dropped (e.g., no acknowledgment is received or a resend packet response is received), the bandwidth is decreased by backing off the window size. TMP also ensures that it becomes more and more resistant to backing off (as compared to TCP) with each new TCP connection that it hosts. A TMP connection should not go down to a window size of less than the number of TCP connections that it is hosting.
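
The window behavior described above can be illustrated with a small sketch. The specification states only that the window grows on acknowledgment up to a maximum, backs off on loss, becomes more resistant to backing off with each hosted TCP connection, and never drops below the number of hosted connections. The specific back-off rule used here (giving up a 1/(n+1) share of the window when hosting n connections, which equals TCP-style halving when n is 1), the one-packet increment, and the names TmpWindow, on_ack and on_loss are assumptions made for illustration.

```python
class TmpWindow:
    """Window-size bookkeeping for one TMP connection (illustrative only)."""

    def __init__(self, max_window=1024):
        self.max_window = max_window  # protocol-specified maximum (value assumed)
        self.hosted = 0               # number of TCP connections being carried
        self.window = 1

    def add_connection(self):
        self.hosted += 1
        # Never run with a window smaller than the number of hosted connections.
        self.window = max(self.window, self.hosted)

    def remove_connection(self):
        self.hosted = max(0, self.hosted - 1)

    def on_ack(self):
        """A delivered and acknowledged packet grows the window up to the maximum."""
        self.window = min(self.max_window, self.window + 1)

    def on_loss(self):
        """A dropped packet shrinks the window, less sharply as more connections are hosted."""
        n = max(1, self.hosted)
        # Assumed rule: give up a 1/(n + 1) share; with n == 1 this halves the window.
        reduced = self.window - self.window // (n + 1)
        self.window = max(reduced, n)  # floor at the number of hosted connections
```

With one hosted connection a loss halves the window, while with 100 hosted connections the same loss costs roughly one percent of the window, which avoids penalizing all 100 connections for a single dropped packet.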

[0083] In a particular implementation, every time a TCP connection is added to (or removed from) what is being passed through the TMP connection, the TMP connection behavior is altered. It is this adaptation that ensures successful connections using TMP. Through the use of the adaptive algorithms discussed above, TMP is able to adapt the amount of bandwidth that it uses. When a new TCP connection is added to the TMP connection, the TMP connection becomes more aggressive. When a TCP connection is removed from the TMP connection, the TMP connection becomes less aggressive.

[0084] TMP pipe 202 provides improved performance in its environment as compared to conventional TCP channels, but it is recognized that TMP pipe 202 resides on the open, shared Internet in the preferred implementations. Hence, TMP must live together with many protocols and share the pipe efficiently in order to allow the other transport mechanisms fair access to the shared communication bandwidth. Since TMP takes only the amount of bandwidth that is appropriate for the number of TCP connections that it is hosting (and since it monitors the connection and controls the number of packets that it puts on the line), TMP will exist cooperatively with TCP traffic. Furthermore, since TMP does a better job at connection monitoring than TCP, TMP is better suited to throughput and bandwidth management than TCP.

[0085] Also shown in FIG. 4 are data filter component 406 and HTTP reassemble component 407 that process incoming (with respect to client 205) data. TMP mechanism 405 receives TMP packets from TMP pipe 202 and extracts the TMP data units. Using the appended sequencing information, the extracted data units are reassembled into HTTP data packet information by HTTP reassembler 407. Data filter component 406 may also implement data decompression where appropriate, decryption, and handle caching when the returning data is of a cacheable type.

[0086] FIG. 5 illustrates principal functional components of an exemplary back-end 203 in greater detail. Primary functions of the back-end 203 include translating transmission control protocol (TCP) packets from web server 210 into TMP packets as well as translating TMP packets received from a front-end 201 into the one or more corresponding TCP packets to be sent to server 210.

[0087] TMP unit 505 receives TMP packets from TMP pipe 202 and passes them to HTTP reassemble unit 507 where they are reassembled into the corresponding TCP packets. Data filter 506 may implement other functionality such as decompression, decryption, and the like to meet the needs of a particular application. The reassembled data is forwarded to TCP component 501 for communication with web server 210.

[0088] TCP data generated by the web server process are transmitted to TCP component 501 and forwarded to HTTP parse mechanism 502. Parser 502 operates in a manner analogous to parser 402 shown in FIG. 4 to extract the data portion from the received TCP packets, perform optional compression, encryption and the like, and forward those packets to data blender 504. Data blender 504 operates in a manner akin to data blender 404 shown in FIG. 4 to buffer and prioritize packets in a manner that is efficient for TMP transfer. Priority information is received by, for example, back-end manager 209 based upon criteria established by the web site owner. TMP data is streamed into TMP unit 505 for communication on TMP pipe 202.

[0089] Returning again to FIG. 2, in a particular example, each front-end server 201 will maintain persistent connections to a number of back-ends, each of which is associated with a destination server. Front-ends 201 maintain a list of alternate connection addresses for back-ends 203 that support various destination sites. These alternate connections can be initiated when traffic on another route to the same destination site has reached capacity, when the connection route used by the current connection has deteriorated to the point that user performance is degraded beyond a specified acceptable limit, based on quality of service monitoring of the connection, or when the alternate route is expected to provide better Quality of Service (QOS). QOS monitoring comprises, for example, monitoring the percentage of lost packets and round trip time (e.g., time elapsed between sending a packet and receiving an acknowledge packet). In a first case both routes will be used simultaneously, with the front-end 201 performing load balancing using performance data reported from the protocol QOS reporting functionality and considering current and expected loads on each route.
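
A minimal sketch of the QOS bookkeeping described above follows, tracking the two quantities the paragraph names: percentage of lost packets and round trip time. The thresholds, the rule of switching only when the current route is degraded, the treatment of unmeasured routes, and the names RouteStats and choose_route are assumptions for illustration rather than details taken from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class RouteStats:
    """Per-route QOS record for one connection address of a back-end."""

    address: str
    sent: int = 0
    lost: int = 0
    rtt_samples: list = field(default_factory=list)

    def record(self, rtt_seconds=None):
        """Record one sent packet; rtt_seconds is None when no acknowledge arrived."""
        self.sent += 1
        if rtt_seconds is None:
            self.lost += 1
        else:
            self.rtt_samples.append(rtt_seconds)

    @property
    def loss_percent(self):
        return 100.0 * self.lost / self.sent if self.sent else 0.0

    @property
    def mean_rtt(self):
        # A route with no measurements is treated as unknown, hence never preferred.
        if not self.rtt_samples:
            return float("inf")
        return sum(self.rtt_samples) / len(self.rtt_samples)


def choose_route(current, alternates, max_loss_percent=5.0, max_rtt=0.5):
    """Keep the current route unless it is degraded and a healthy alternate exists."""
    def healthy(route):
        return route.loss_percent <= max_loss_percent and route.mean_rtt <= max_rtt

    if healthy(current):
        return current
    candidates = [route for route in alternates if healthy(route)]
    return min(candidates, key=lambda r: (r.loss_percent, r.mean_rtt), default=current)
```

The sketch covers only the switch-on-degradation case. Using both routes simultaneously, as in the first case mentioned above, would additionally require splitting traffic in proportion to the measured quality of each route.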

[0090] FIG. 6 illustrates an alternative embodiment in which the load balancing functionality is implemented in a single intermediary server 601 located at a front-end (e.g., on the client side of network 101). In an example operation shown in FIG. 6, front-end 601 may be supported in a network service provider's location 602. Front-end 601 supports each web server 610 that together implement a particular web site such as web site 210 shown in FIG. 2. Each front-end 601 may support multiple web sites as well, although multiple web site support is not shown for ease of description and understanding. Each web server 610 may have its own globally unique network address and so be directly accessible via network 101. Alternatively, one or more servers 610 may be coupled together by a WAN or LAN having private addresses so that connections must be funneled through a single access point.

[0091] QOS monitor 604 monitors the channel(s) between front-end 601 and each web server 610. Additionally, front-end 601 may have a preselected high water mark value indicating a maximum number of connections that can be in flight to a given web server 610. In yet another alternative, front-end 601 may conduct in-band or out-of-band communication with web servers 610 to determine their status with respect to current load. In any case, front-end 601 uses the QOS and server load information to select channels through network 101.

[0092] In a particular example, some or all of web servers 610 provide some set of redundant services such that a given request may be satisfied by more than one server 610. Front-end 601 selects which of the capable servers 610 to send a particular request based upon the server load availability and/or QOS data.

[0093] In another example, front-end 601 receives and buffers multiple requests for services directed to web servers 610. The buffered requests may come from a single client or from multiple browser clients 605. Front-end 601 selects buffered requests for transmission across network 101. In accordance with the present invention, the order in which buffered requests are selected is determined, at least in part, by the requirements of load balancing amongst the multiple available web servers 610. In other words, if a front-end 601 holds buffered requests for two web servers 610, preference will be given to launching requests to a non-busy server while a buffered request to a busy server may remain in the buffer for a longer time.
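
The selection order described above might be sketched as follows. The representation of the buffer as a list of (server, request) pairs, the use of in-flight counts as the measure of busyness, the reuse of the high water mark as a hard cutoff, and the names pick_next and dispatch_one are assumptions introduced for this illustration.

```python
def pick_next(buffered, in_flight, high_water_mark):
    """Return the index of the buffered request whose target server is least busy.

    `buffered` is a list of (server, request) pairs in arrival order. Servers at
    or above the high water mark are skipped. Ties go to the oldest request.
    """
    best = None
    for index, (server, _request) in enumerate(buffered):
        load = in_flight.get(server, 0)
        if load >= high_water_mark:
            continue
        if best is None or load < in_flight.get(buffered[best][0], 0):
            best = index
    return best


def dispatch_one(buffered, in_flight, high_water_mark):
    """Remove and return the next (server, request) pair to launch, or None."""
    index = pick_next(buffered, in_flight, high_water_mark)
    if index is None:
        return None  # every target server is at its high water mark
    server, request = buffered.pop(index)
    in_flight[server] = in_flight.get(server, 0) + 1
    return server, request
```

A request addressed to a busy server is never discarded in this sketch. It simply stays in the buffer until that server's in-flight count falls below the counts of the other candidates or below the high water mark.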

[0094] FIG. 7 illustrates an alternative implementation in which load balancing functionality is implemented in a single intermediary server 701 located at the back-end (e.g., the server side of network 101). Back-end server 701 receives requests from a variety of clients and/or front-end web servers 201. Some or all of web servers 610 provide a set of redundant services such that a given request may be satisfied by more than one server 610. Back-end 701 selects which of the capable servers 610 to send a particular request based upon the server load and/or QOS data. Back-end 701 can directly monitor server load as it is aware of how many requests are pending at any particular server 610. Moreover, back-end 701 can be programmed to be aware of the capabilities of each web server 610. Hence, when the number of outstanding requests to a particular server reaches a preselected high water mark, requests can be routed to other servers 610.

[0095] Preferably, back-end server 701 routes requests based not only on volume, but also based on type of request. Requests for database access, multimedia content delivery, dynamic web page generation, and static web page delivery vary significantly in the resource load they place on a server. Back-end server 701 monitors the type of request being made based upon, for example, header information in the received packet and/or information determined by parsing the request itself. Using this information, a server with no pending multimedia requests may be favored over an alternative server 610 with one or two pending multimedia requests.
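
One way to picture type-aware routing is to weight each pending request by an estimated cost and pick the capable server with the lowest total. The cost values in REQUEST_COST are invented for this sketch and are not taken from the specification, as are the names server_load and select_server.

```python
# Hypothetical relative costs per request type; real values would be measured.
REQUEST_COST = {"static": 1, "dynamic": 3, "database": 5, "multimedia": 8}


def server_load(pending_types):
    """Estimated load of a server given the types of its pending requests."""
    return sum(REQUEST_COST[kind] for kind in pending_types)


def select_server(pending_by_server, capable_servers):
    """Pick the capable server with the lowest estimated load (first one wins ties)."""
    return min(
        capable_servers,
        key=lambda server: server_load(pending_by_server.get(server, [])),
    )


pending = {
    "server_a": ["multimedia"],                  # one heavy request
    "server_b": ["static", "static", "static"],  # three light requests
}
chosen = select_server(pending, ["server_a", "server_b"])
```

Under these assumed costs the server with three pending static requests is still preferred over the server with a single pending multimedia request, which a pure request count would not capture.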

[0096] As shown in FIG. 7, back-end 701 may include a queuing data structure 702. Queue 702 holds requests so that they can be applied to servers 610 in a manner that improves performance of server 610. The size, timing, and type of request may be used to determine when a request is released from queue 702 and applied to a server 610. Also, requests can be queued to maintain a substantially consistent number of pending requests to any given server 610 to improve performance of that server.
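
The last behavior described above, holding requests so that a server sees a substantially consistent number of pending requests, can be sketched as a queue that releases work only while the pending count is below a target. The class name ReleaseQueue, the fixed target, and first-in/first-out release order are assumptions. Release decisions based on request size, timing, or type are not modeled.

```python
from collections import deque


class ReleaseQueue:
    """Holds requests for one server and keeps its pending count near a target."""

    def __init__(self, target_pending):
        self.target_pending = target_pending
        self.waiting = deque()
        self.pending = 0  # requests released to the server and not yet answered

    def submit(self, request):
        """Queue a request and return the requests that may be sent right now."""
        self.waiting.append(request)
        return self._release()

    def on_response(self):
        """Note a completed request and return any requests freed for release."""
        self.pending = max(0, self.pending - 1)
        return self._release()

    def _release(self):
        released = []
        while self.waiting and self.pending < self.target_pending:
            released.append(self.waiting.popleft())
            self.pending += 1
        return released
```

The caller is expected to forward each released request to the server and to call on_response when the corresponding reply arrives.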

[0097] In any of the load balancing implementations disclosed herein, the load balancing decision (i.e., which server will receive a given request) can be made not only based upon relative load of the available servers, but also based upon other criteria. These other criteria might be, for example, which available server has the freshest content. Sometimes mirrored web sites, for example, have a disparity between mirror copies such that some are maintained more frequently than others. The load balancing web servers in accordance with the present invention are capable of selecting amongst the mirrors by using knowledge of which mirror is designed to have the freshest content. Hence, the “original” web site may be given a disproportionate volume of traffic from a pure load balancing standpoint up until it is unable to efficiently process the volume of requests. After this point is reached, the mirror sites may be used.

[0098] Although the invention has been described and illustrated with a certain degree of particularity, it is understood that the present disclosure has been made only by way of example, and that numerous changes in the combination and arrangement of parts can be resorted to by those skilled in the art without departing from the spirit and scope of the invention, as hereinafter claimed. For example, while devices supporting HTTP data traffic are used in the examples, the HTTP devices may be replaced or augmented to support other public and proprietary protocols and languages including FTP, NNTP, SMTP, SQL and the like. In such implementations the front-end 201 and/or back-end 203 are modified to implement the desired protocol. Moreover, front-end 201 and back-end 203 may support different protocols and languages such that the front-end 201 supports, for example, HTTP traffic with a client and the back-end supports a DBMS protocol such as SQL. Such implementations not only provide the advantages of the present invention, but also enable a client to access a rich set of network resources with minimal client software.

We claim:
 1. A system for load balancing in a network environment comprising: a plurality of servers coupled to a network; a set of network resources associated with each of the servers, wherein at least some of the network resources are redundant; a client coupled to the network and generating a request specifying some of the redundant resources; a gateway machine coupled to the network in communication with the client, the gateway machine configured to receive the request from the client, select from amongst the servers that are associated with the request-specified redundant services, establish a communication channel with the selected server, and access the specified server to service the received client request; and means coupled to the gateway machine for selecting amongst servers of redundant resources a particular server for a received request so as to balance load amongst the servers providing redundant resources.
 2. The system of claim 1 further comprising means within the gateway machine for selecting amongst a plurality of channels between the gateway machine and the servers associated with the network resources specified by the client request.
 3. The system of claim 1 wherein at least some of the plurality of servers comprise world wide web servers.
 4. The system of claim 3 wherein the client comprises a web browser.
 5. The system of claim 1 wherein each of the plurality of servers is associated with a network address and the gateway machine comprises a web server having a network address distinct from the plurality of servers.
 6. The system of claim 1 wherein the network comprises the Internet.
 7. The system of claim 1 further comprising means for monitoring quality of service between the gateway and each of the servers.
 8. The system of claim 7 wherein the means for monitoring quality of service is implemented within the gateway machine.
 9. The system of claim 7 further comprising means for selecting from amongst the servers providing redundant services using the relative quality of service between the servers.
 10. The system of claim 7 further comprising: means for allocating an additional server with redundant services in response to the quality of service falling below a preselected level, wherein the gateway machine is configured to establish a new communication channel through the network with the additional server.
 11. The system of claim 1 wherein the gateway machine further comprises: a means for generating a response to the client request using services provided to the gateway machine by the servers.
 12. A method for load balancing in a network environment comprising: providing a communication network; providing a plurality of servers coupled to the network wherein at least some of the servers provide redundant services; generating a request for a redundant service in a network-coupled client machine; directing the request to a network-coupled gateway machine; causing the gateway machine to select amongst servers providing the redundant services a particular server for the received request so as to balance load amongst the plurality of servers providing the redundant services; and after selecting a particular server, causing the gateway machine to generate a request to the selected server for the specified resources.
 13. The method of claim 12 selecting amongst a plurality of channels between the gateway machine and the servers of redundant services specified by the client request.
 14. The method of claim 12 wherein at least some of the plurality of servers comprise world wide web servers and the client comprises a web browser.
 15. The method of claim 12 wherein the network comprises the Internet.
 16. The method of claim 12 further comprising monitoring quality of service between the gateway and each of the servers.
 17. The method of claim 16 further comprising selecting from amongst the servers providing redundant services using the relative quality of service between the servers.
 18. The method of claim 12 further comprising using the gateway machine to generate a response to the client request using services provided to the gateway machine by the servers.
 19. A system for improving performance of a network connected server comprising: an intermediary server receiving requests for server access from a plurality of sources; means for monitoring at least one variable affecting server performance; and a queue data structure within the intermediary server, wherein the queue data structure is responsive to the means for monitoring to release requests from the queue data structure in a manner determined to improve server performance.